Sparse Learning for Large-Scale and High-Dimensional Data: A Randomized Convex-Concave Optimization Approach
Authors
Abstract
In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer. Under the assumption that there exists an (approximately) sparse solution with high classification accuracy, we argue that the dual solution is also sparse or approximately sparse. The fact that both primal and dual solutions are sparse motivates us to develop a randomized approach for a general convex-concave optimization problem. Specifically, the proposed approach combines the strength of random projection with that of sparse learning: it utilizes random projection to reduce the dimensionality, and introduces l1-norm regularization to alleviate the approximation error caused by random projection. Theoretical analysis shows that under favorable conditions, the randomized algorithm can accurately recover the optimal solutions to the convex-concave optimization problem (i.e., recover both the primal and dual solutions).
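As a rough illustration of the two ingredients described above, the Python sketch below projects high-dimensional features with a Gaussian random matrix and then fits an l1-regularized linear classifier in the reduced space. The problem sizes, penalty strength, and the use of logistic loss are illustrative assumptions; this is a minimal sketch of the idea, not the algorithm analyzed in the paper.

    # Minimal sketch (not the paper's algorithm): Gaussian random projection
    # followed by l1-regularized classification in the reduced space.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, d, m = 500, 10_000, 200            # samples, ambient dimension, projected dimension
    w_true = np.zeros(d)
    w_true[:20] = rng.normal(size=20)     # (approximately) sparse ground-truth model
    X = rng.normal(size=(n, d))
    y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

    # Step 1: random projection R reduces the dimensionality from d to m.
    R = rng.normal(size=(d, m)) / np.sqrt(m)
    X_proj = X @ R

    # Step 2: l1-norm regularization helps control the approximation error
    # introduced by the projection (hyperparameters are illustrative).
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    clf.fit(X_proj, y)
    print("training accuracy in the projected space:", clf.score(X_proj, y))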
Related Papers
Finding sparse solutions to problems with convex constraints via concave programming
In this work, we consider a class of nonlinear optimization problems with convex constraints, with the aim of computing sparse solutions. This is an important task arising in various fields such as machine learning, signal processing, and data analysis. We adopt a concave optimization-based approach, define an effective version of the Frank-Wolfe algorithm, and prove the global convergence of ...
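For context, the following is a minimal, generic Frank-Wolfe iteration for least squares over an l1 ball, included only to show why Frank-Wolfe steps tend to produce sparse iterates; it is not the concave-programming formulation of that paper, and the radius and problem sizes are illustrative assumptions.

    # Generic Frank-Wolfe for min 0.5*||Ax - b||^2 over the l1 ball of radius tau.
    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(100, 300))
    x_true = np.zeros(300)
    x_true[:5] = 1.0
    b = A @ x_true
    tau = 10.0                           # l1-ball radius (illustrative)

    x = np.zeros(300)
    for t in range(200):
        grad = A.T @ (A @ x - b)         # gradient of the least-squares objective
        i = np.argmax(np.abs(grad))      # linear minimization oracle on the l1 ball
        s = np.zeros_like(x)
        s[i] = -tau * np.sign(grad[i])
        gamma = 2.0 / (t + 2.0)          # standard step-size schedule
        x = (1 - gamma) * x + gamma * s  # each step adds at most one new nonzero
    print("nonzeros in x:", np.count_nonzero(x))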
Mammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease
Background and purpose: Machine learning is a class of modern, powerful tools that can solve many important problems people face today. Support vector regression (SVR), a notable member of the machine learning family, is a way to build a regression model. SVR has been proven to be an effective tool for real-valued function estimation. As a supervised-learning appr...
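A generic scikit-learn SVR sketch on synthetic data is shown below, standing in for gene-expression features that are not available here; it is not the authors' pipeline.

    # Generic epsilon-SVR on synthetic data (not the study's data or settings).
    import numpy as np
    from sklearn.svm import SVR
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 50))                 # stand-in for expression features
    y = X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=300)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = SVR(kernel="rbf", C=1.0, epsilon=0.1)  # epsilon-insensitive loss
    model.fit(X_tr, y_tr)
    print("R^2 on held-out data:", model.score(X_te, y_te))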
Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications
The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including ...
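A minimal two-stage Lasso sketch is given below; it mimics the multistage adaptive idea by reweighting the l1 penalty per coordinate via column rescaling. The weights, penalty level, and synthetic data are assumptions, not the estimator defined in that article.

    # Two-stage (adaptive-style) Lasso via column rescaling; purely illustrative.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(3)
    n, p = 200, 1000
    beta = np.zeros(p)
    beta[:10] = 2.0
    X = rng.normal(size=(n, p))
    y = X @ beta + 0.5 * rng.normal(size=n)

    # Stage 1: plain l1 penalty.
    stage1 = Lasso(alpha=0.1).fit(X, y)

    # Stage 2: lighter penalty on coordinates that stage 1 found active.
    w = 1.0 / (np.abs(stage1.coef_) + 1e-3)        # per-coordinate weights
    stage2 = Lasso(alpha=0.1).fit(X / w, y)        # rescaling columns = weighted l1
    beta_hat = stage2.coef_ / w
    print("stage-2 nonzeros:", np.count_nonzero(np.abs(beta_hat) > 1e-8))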
Sparse estimation of high-dimensional correlation matrices
Estimating the covariation of variables in high-dimensional data is important for understanding their relations. Recent years have seen several attempts to estimate covariance matrices with sparsity constraints. A new convex optimization formulation is proposed for estimating correlation matrices, which are scale invariant, as opposed to covariance matrices. The constrained optimization problem i...
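As a much simpler stand-in for the convex program described above, the sketch below soft-thresholds the off-diagonal entries of a sample correlation matrix; the threshold value and data are illustrative assumptions.

    # Soft-thresholding of a sample correlation matrix (simplified illustration).
    import numpy as np

    rng = np.random.default_rng(4)
    X = rng.normal(size=(200, 50))
    R = np.corrcoef(X, rowvar=False)           # 50 x 50 sample correlation matrix

    lam = 0.15                                 # threshold (illustrative)
    R_sparse = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    np.fill_diagonal(R_sparse, 1.0)            # keep the unit diagonal
    print("off-diagonal nonzeros:", np.count_nonzero(R_sparse) - 50)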
Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python
We describe a new library named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (e.g., sparse linear regression, sparse logistic regression, sparse Poisson regression, and sparse square root loss linear regression), combined with efficient active set selection strategies. In addition, the library allows users to choose diffe...
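The snippet below uses scikit-learn's lasso_path merely to illustrate the pathwise idea (coordinate descent over a grid of penalty values); it is not the picasso API, whose interface is not shown here.

    # Pathwise l1 estimation with scikit-learn's lasso_path (not picasso).
    import numpy as np
    from sklearn.linear_model import lasso_path

    rng = np.random.default_rng(5)
    X = rng.normal(size=(150, 400))
    beta = np.zeros(400)
    beta[:8] = 1.5
    y = X @ beta + 0.1 * rng.normal(size=150)

    alphas, coefs, _ = lasso_path(X, y, n_alphas=50)
    for a, c in zip(alphas[::10], coefs[:, ::10].T):
        print(f"alpha={a:.4f}  nonzeros={np.count_nonzero(c)}")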